A Comprehensive Review of Arabic Text Summarization

Abstract

The explosion of online and offline data has changed how we gather, evaluate, and understand data. It is frequently difficult and time-consuming to comprehend large text documents and extract crucial information from them. Text summarization techniques address these problems by compressing long texts while retaining their essential contents. These techniques rely on the fast delivery of filtered, high-quality content to users. Due to the massive amounts of data generated by technology and various sources, automated summarization of large-scale data is challenging. There are three types of automatic text summarization techniques: extractive, abstractive, and hybrid. Regardless of the technique, the generated summaries are still a long way from those produced by human experts. Although Arabic is a widely spoken language that is used for sharing content on the web, Arabic text summarization is limited and still immature because of several problems, including the language's morphological structure, the variety of dialects, and the lack of adequate data sources. This paper reviews Arabic text summarization approaches and recent deep learning models for this task. Additionally, it focuses on existing datasets for these approaches, which are also reviewed, along with their characteristics and limitations. The most often used metrics for summarization quality evaluation are ROUGE-1, ROUGE-2, ROUGE-L, and BLEU. The challenges encountered by Arabic text summarization methods and the solutions proposed in each approach are analyzed. Many Arabic text summarization methods have problems, such as the lack of golden tokens during testing, out-of-vocabulary (OOV) words, repeated summary sentences, the lack of standard systematic methodologies and architectures, and the complexity of the Arabic language. Finally, providing the required corpora, improving evaluation using semantic representations, the use of ROUGE metrics in abstractive summarization, and adopting recent deep learning models in Arabic summarization studies are in demand.
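As an illustration of the ROUGE-N metrics named in the abstract, the following is a minimal sketch of n-gram overlap scoring between a candidate summary and a reference summary. It is not taken from the paper: the function names `ngrams` and `rouge_n` are hypothetical, and whitespace tokenization is an assumption made for brevity. Arabic text would normally need normalization and morphology-aware tokenization before scoring.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N precision, recall, and F1 for two strings.

    Assumption: tokens are obtained by splitting on whitespace.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())

    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    candidate = "the cat sat on the mat"
    reference = "the cat is on the mat"
    print(rouge_n(candidate, reference, n=1))  # unigram overlap (ROUGE-1)
    print(rouge_n(candidate, reference, n=2))  # bigram overlap (ROUGE-2)
```

Because the score depends only on surface n-gram overlap, a summary that paraphrases the reference with different words scores low even when its meaning is preserved, which is one reason the abstract mentions semantic representations for evaluation.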

Similar resources

Systematic literature review of fuzzy logic based text summarization

"Information Overload" is not a new term, but with the massive development in technology, which enables anytime, anywhere, easy and unlimited access, participation in and publishing of information has consequently escalated its impact. Assisting users' informational searches with reduced reading and surfing time by extracting and evaluating accurate, authentic and relevant information are the primary c...

The Thinning Problem in Arabic Text Recognition – A Comprehensive Review

The goal of this paper is to present an overview of the thinning problem in Arabic text recognition. Thinning ("skeletonization") is a crucial stage in ACR: it simplifies the text shape and reduces the amount of data that needs to be handled, and it is usually used as a pre-processing stage for recognition and storage systems. The skeleton of Arabic text can be used for each ...

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

Text Summarization as Feature Selection for Arabic Text Classification

Text classification (TC) or text categorization task is assigning a document to one or more predefined classes or categories. A common problem in TC is the high number of terms or features in document(s) to be classified (the curse of dimensionality). This problem can be solved by selecting the most important terms. In this study, an automatic text summarization is used for feature selection. S...

Arabic Text Watermarking: A Review

The use of the internet, with its technologies and applications, has increased rapidly, so protecting text from illegal use is much needed. Text watermarking is used for this purpose. Arabic text has many characteristics, such as the existence of diacritics, kashida (extension character), and points above or under its letters. Each Arabic letter can take different shapes with different Un...

Journal

Journal title: IEEE Access

Year: 2022

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2022.3163292